22.02.22 - Machine Learning_ Techniques

Types of Tasks

Supervised Learning - Function from labelled training data, each example consisting of an input and a desired output (Classifications, regression)

Unsupervised Learning - Function to describe hidden structure from unlabelled data. There is basically no output

Classification - Learn to predict to which set a instance belongs to based on pre-labeled (classified) instances Supervised learning, many approaches: regression, decision trees, neural networks

Supervised Learning

Regression

Estimated relationship between variable Y and variable(s) X Function: Based on the given data to minimise its mean squared error (MSE) to 'fit' the data

Decision Tree

Internal Nodes: decision rules on features (decision variables, input) Leaf Nodes: a predicted class label(output) |Pros|Cons| |----|----| |Reasonable training time| Only simple decision boundaries| |Handle large features|Problem with lots of missing data| |Easy to implement|Cannot handle complicated relationship| |Easy to interpret|

Neural Networks

Set of neurons connected by directed weighted edges |Pros|Cons| |----|----| |Cam learn more complicated class boundaries|Hard to implement| |Can be more accurate| Slow training time| |Can handle large number of features|Can over fit the data| | |Hard to interpret|

Supervised Learning: Applications

Clustering

Given: Un-labeled data set D and Similarity/distance metric
Goal: Find 'natural' partitioning, or groups of similar data points
Application: Divide a market into distinct subsets of customers

Association Rules

Discover correlation between any two or more variables

Given: Set of records containing items
Goal: Produce dependency rules, to predict occurrence of one variable based on occurrences of another variable
Application: Market basket analysis

Correlation $\ne$ Causation

Types of Tasks​

Supervised Learning​

Regression​

Decision Tree​

Neural Networks​

Supervised Learning: Applications​

Clustering​

Association Rules​